# Adding a Custom Task

Lighteval provides a flexible framework for creating custom evaluation tasks. This guide explains how to create and integrate new tasks into the evaluation system.

## Step-by-Step Creation of a Task

> [!WARNING]
> To contribute your task to the Lighteval repository, first install the
> required dev dependencies by running `pip install -e .[dev]`, then run
> `pre-commit install` to install the pre-commit hooks.

### Step 1: Create the Task File

First, create a Python file or directory under the `src/lighteval/tasks/tasks` directory.
A directory is helpful if you need to split your task across multiple files; just make sure one of the files is named `main.py`.

### Step 2: Define the Prompt Function

Define a prompt function that converts a line from your dataset into a
`Doc` object to be used for evaluation.

```python
from lighteval.tasks.requests import Doc

# Define as many as you need for your different tasks
def prompt_fn(line: dict, task_name: str):
    """Defines how to go from a dataset line to a doc object.
    Follow examples in src/lighteval/tasks/default_prompts.py, or get more info
    about what this function should do in the README.
    """
    return Doc(
        task_name=task_name,
        query=line["question"],
        choices=[f" {c}" for c in line["choices"]],
        gold_index=line["gold"],
    )
```
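
To make the mapping concrete, the sketch below applies the same logic to one hand-written row. `DocSketch` is a hypothetical stand-in that mirrors only the four fields used above, so the snippet runs without Lighteval installed. The row keys (`question`, `choices`, `gold`) are the ones the prompt function assumes, and the values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DocSketch:
    """Hypothetical stand-in for lighteval's Doc, limited to the fields used above."""

    task_name: str
    query: str
    choices: list[str]
    gold_index: int


def prompt_fn_sketch(line: dict, task_name: str) -> DocSketch:
    # Same mapping as prompt_fn above, but returning the stand-in class.
    return DocSketch(
        task_name=task_name,
        query=line["question"],
        choices=[f" {c}" for c in line["choices"]],  # leading space before each choice
        gold_index=line["gold"],
    )


# A made-up dataset row with the keys the prompt function expects.
example_line = {"question": "What is 2 + 2?", "choices": ["3", "4", "5"], "gold": 1}
example_doc = prompt_fn_sketch(example_line, task_name="myothertask")

# The gold answer is the choice at position gold_index.
print(repr(example_doc.choices[example_doc.gold_index]))
```

If your dataset uses different column names, only the keys read from `line` need to change.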

### Step 3: Choose or Create Metrics

You can either use an existing metric (defined in [`lighteval.metrics.metrics.Metrics`]) or [create a custom one](adding-a-new-metric).

#### Using Existing Metrics

```python
from lighteval.metrics.metrics import Metrics

# Use an existing metric
metric = Metrics.ACCURACY
```

#### Creating Custom Metrics

```python
from lighteval.metrics.utils.metric_utils import SampleLevelMetric
import numpy as np

custom_metric = SampleLevelMetric(
    metric_name="my_custom_metric_name",
    higher_is_better=True,
    category="accuracy",
    sample_level_fn=lambda x: x,  # How to compute score for one sample
    corpus_level_fn=np.mean,  # How to aggregate the sample metrics
)
```
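
The two callables split the work: `sample_level_fn` scores one sample, and `corpus_level_fn` reduces the per-sample scores to a single number. The sketch below shows that two-stage flow in plain Python, without Lighteval. The function names, the exact-match scoring rule, and the data are illustrative assumptions; the arguments Lighteval actually passes to `sample_level_fn` may differ, so check the metric guide before reusing it. `statistics.fmean` stands in for `np.mean`.

```python
from statistics import fmean


def exact_match_sample(prediction: str, gold: str) -> float:
    """Hypothetical sample-level score: 1.0 if the stripped strings match, else 0.0."""
    return float(prediction.strip() == gold.strip())


def aggregate(sample_scores: list[float]) -> float:
    """Corpus-level aggregation: the mean of the per-sample scores."""
    return fmean(sample_scores)


# Made-up predictions and references, for illustration only.
predictions = ["Paris", " Berlin", "Rome"]
golds = ["Paris", "Berlin", "Madrid"]

# Stage 1: one score per sample. Stage 2: one score for the whole corpus.
sample_scores = [exact_match_sample(p, g) for p, g in zip(predictions, golds)]
corpus_score = aggregate(sample_scores)
print(sample_scores, corpus_score)
```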

### Step 4: Define Your Task

You can define a task with or without subsets using [`~tasks.lighteval_task.LightevalTaskConfig`].

#### Simple Task (No Subsets)

```python
from lighteval.tasks.lighteval_task import LightevalTaskConfig

# This is how you create a simple task (like HellaSwag) which has one single subset
# attached to it, and one evaluation possible.
task = LightevalTaskConfig(
    name="myothertask",
    prompt_function=prompt_fn,  # Must be defined in the file or imported
    hf_repo="your_dataset_repo_on_hf",
    hf_subset="default",
    hf_avail_splits=["train", "test"],
    evaluation_splits=["test"],
    few_shots_split="train",
    few_shots_select="random_sampling_from_train",
    metrics=[metric],  # Select your metric in Metrics
    generation_size=256,
    stop_sequence=["\n", "Question:"],
)
```

#### Task with Multiple Subsets

If you want to create a task with multiple subsets, add them to the
`SAMPLE_SUBSETS` list and create a task for each subset.

```python
SAMPLE_SUBSETS = ["subset1", "subset2", "subset3"]  # List of all the subsets to use for this eval

class CustomSubsetTask(LightevalTaskConfig):
    def __init__(
        self,
        name,
        hf_subset,
    ):
        super().__init__(
            name=name,
            hf_subset=hf_subset,
            prompt_function=prompt_fn,  # Must be defined in the file or imported
            hf_repo="your_dataset_name",
            metrics=[custom_metric],  # Select your metric in Metrics or use your custom_metric
            hf_avail_splits=["train", "test"],
            evaluation_splits=["test"],
            few_shots_split="train",
            few_shots_select="random_sampling_from_train",
            generation_size=256,
            stop_sequence=["\n", "Question:"],
        )

SUBSET_TASKS = [CustomSubsetTask(name=f"task:{subset}", hf_subset=subset) for subset in SAMPLE_SUBSETS]
```
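
The list comprehension produces one task per subset, each named `task:<subset>`. The short sketch below prints only the resulting names, which can be a quick way to check the naming scheme before running an evaluation. It reuses the placeholder subset names from above and does not need Lighteval.

```python
SAMPLE_SUBSETS = ["subset1", "subset2", "subset3"]

# Same naming pattern as the list comprehension above: "task:<subset>".
subset_task_names = [f"task:{subset}" for subset in SAMPLE_SUBSETS]

for task_name in subset_task_names:
    print(task_name)
```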

### Step 5: Add Tasks to the Table

Next, add your tasks to the `TASKS_TABLE` list.

```python
# STORE YOUR EVALS

# Tasks with subsets:
TASKS_TABLE = SUBSET_TASKS

# Tasks without subsets:
# TASKS_TABLE = [task]
```
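
Since `TASKS_TABLE` is a plain Python list, a file that defines both kinds of task can presumably concatenate them. The sketch below illustrates this with `SimpleNamespace` objects as hypothetical stand-ins for `LightevalTaskConfig` instances, so it runs without Lighteval. The duplicate-name check is an optional sanity check added here as an assumption, not something the framework is known to require.

```python
from collections import Counter
from types import SimpleNamespace

# Hypothetical stand-ins for LightevalTaskConfig objects; only `name` matters here.
task = SimpleNamespace(name="myothertask")
SUBSET_TASKS = [SimpleNamespace(name=f"task:{s}") for s in ["subset1", "subset2", "subset3"]]

# Register the subset tasks and the simple task together.
TASKS_TABLE = SUBSET_TASKS + [task]

# Optional sanity check: collect any task name that appears more than once.
name_counts = Counter(t.name for t in TASKS_TABLE)
duplicates = [name for name, count in name_counts.items() if count > 1]
print(len(TASKS_TABLE), duplicates)
```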

### Step 6: Create a Requirement File

If your task has extra dependencies, create a `requirement.txt` file that lists
only those dependencies so that anyone can run your task.

## Running Your Custom Task

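Once your file is created, you can run the evaluation with the following command, replacing `{task}` with your task specification and `{path_to_your_custom_task_file}` with the path to your file: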

```bash
lighteval accelerate \
    "model_name=HuggingFaceH4/zephyr-7b-beta" \
    {task} \
    --custom-tasks {path_to_your_custom_task_file}
```

### Example Usage

```bash
# Run a custom task with 3-shot evaluation
lighteval accelerate \
    "model_name=openai-community/gpt2" \
    "myothertask|3" \
    --custom-tasks community_tasks/my_custom_task.py
```
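
When launching several runs from a script, the same command can be assembled in Python. The sketch below is a minimal example: `build_lighteval_command` is a hypothetical helper, not part of Lighteval, and the `<task>|<shots>` format is inferred from the example above. It only builds and prints the command; nothing is executed.

```python
import shlex


def build_lighteval_command(
    model_name: str, task_name: str, num_fewshot: int, custom_tasks_path: str
) -> list[str]:
    """Hypothetical helper: assemble the argument list for the CLI call shown above."""
    # Format inferred from the example: "<task>|<shots>".
    task_spec = f"{task_name}|{num_fewshot}"
    return [
        "lighteval",
        "accelerate",
        f"model_name={model_name}",
        task_spec,
        "--custom-tasks",
        custom_tasks_path,
    ]


command = build_lighteval_command(
    model_name="openai-community/gpt2",
    task_name="myothertask",
    num_fewshot=3,
    custom_tasks_path="community_tasks/my_custom_task.py",
)

# Print a shell-quoted string; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell-quoting problems with the `|` character in the task specification.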
